Abstract. This article investigates the super-resolution phenomenon using the 
celebrated statistical estimator LASSO in the complex valued measure framework. 
More precisely, we study the recovery of a discrete measure (spike train) from few 
noisy observations (Fourier samples, moments, Stieltjes transformation...). In par- 
ticular, we provide an explicit quantitative localization of the spikes. Moreover, 
our analysis is based on the Rice method and provides an upper bound on the 
supremum of the white noise perturbation in the measure space. 



1. Introduction 

1.1. Super-resolution. In some situations, experiments can be subject to device 
limitations where one cannot observe enough information in order to recover 
fine details from an image. For instance, in optical imaging, the physical
limitations are evaluated by the resolution, which measures the minimal distance
between lines that can be distinguished. Hence, the details below the resolution
limit seem unreachable. The super-resolution phenomenon is the ability to 
recover the information beyond the physical limitations. Surprisingly, if the ob- 
ject of interest is simple (e.g. a discrete measure) then it is possible to override 
the resolution limit. In particular, the reader may think of important questions 
in applied harmonic analysis such as the problem of breaking the diffraction 
limit of an optical system or the issues arising in source separation. Many com- 
panion applications in astronomy, medical imaging and microscopy are at stake 
[Don92, CFG12b, CFG12a] and theoretical guarantees of source detection are of 
crucial importance in practice. For instance, the idea of this paper gives a process 
to compute a quantitative estimate of the localization of the active molecules in 
Single Molecule Imaging in 3D Microscopy [SBC+12]. 

This paper offers quantitative detection guarantees from noisy observations 
(Fourier samples, moments, Stieltjes transformation...). The authors provide a 
tractable algorithm (BLASSO) and quantitative estimates of a train of complex 
valued spikes from few noisy observations. 

Similarly, P. Doukhan, E. Gassiat and one author of the present paper 
[DG96, GG96] considered the exact reconstruction of a nonnegative measure. 
More precisely, they derived results when one only knows the values of a finite
number of linear functionals at the target measure. Moreover, they studied 
stability with respect to a metric for weak convergence. 

Likewise, two authors of this paper [dCG12] proved that trains of k spikes can 
be faithfully resolved from m = 2k + 1 samples (Fourier, Stieltjes transformation, 
Laplace transform, ...) by using a total-variation minimization method. 
Last but not least, our analysis involves an estimate of the magnitude of the 
noise perturbation in the signal domain using the Rice method, see for example
[AW08]. In particular, we derive explicit bounds for the tuning parameter 
appearing in BLASSO. 

1.2. General model and notation. Let $T$ be a compact set homeomorphic to either
the interval $[0, 1]$ or the unit circle $S^1$ (which is identified with
$\mathbb{R} \bmod 2\pi$ via the mapping $z = e^{it}$). Let $\Delta$ be a complex
measure on $T$ with discrete support of (unknown) size $s$. In particular,
$\Delta$ has polar decomposition (see [Rud87] for a definition):

\[ \Delta = \sum_{k=1}^{s} \Delta_k \exp(i\theta_k)\, \delta_{T_k}, \]

where $\Delta_k > 0$, $\theta_k \in \mathbb{R}$, $T_k \in T$ for $k = 1, \dots, s$
and $\delta_x$ denotes the Dirac measure at point $x$.
Let $m$ be a positive integer and
$\mathcal{F} = \{\varphi_0, \varphi_1, \dots, \varphi_m\}$ be a family of complex
continuous functions on $T$. Define the $k$-th generalized moment of a complex
measure $\mu$ on $T$ as:

\[ c_k(\mu) = \int_T \varphi_k \, d\mu, \]

for all the indices $k = 0, 1, \dots, m$. Assume that we observe
$y = (y_k)_{k=0}^{m}$ defined as:

\[ \forall k \in \{0, 1, \dots, m\}, \quad y_k = c_k(\Delta) + \varepsilon_k, \]

where $\varepsilon = (\varepsilon_k)_{k=0}^{m}$ is a complex valued white noise.
This can be written as:

\[ y = \int_T \Phi \, d\Delta + \varepsilon, \]

where $\Phi = (\varphi_0, \dots, \varphi_m)$. We aim at reconstructing the complex
measure $\Delta$ from the $m + 1$ measurements given by $y$.

Remark. Throughout this article, we shall mention examples in the Fourier case
(Fourier samples) or in the polynomial case (moment samples); the corresponding
notation is described where these examples appear. Unless otherwise specified,
notation is in accordance with the general model. 

1.3. Beurling LASSO (BLASSO). Denote by $\mathcal{M}$ the set of finite complex
measures on $T$ and by $\|\cdot\|_{TV}$ the total variation norm. We recall that
for all $\mu \in \mathcal{M}$,

\[ \|\mu\|_{TV} = \sup_{\Pi} \sum_{E \in \Pi} |\mu(E)|, \]

where the supremum is taken over all partitions $\Pi$ of $T$ into a finite number of 
disjoint measurable subsets. For further details, we refer the reader to [Rud87]. 

Remark. We mention that the TV-norm considered in this paper is not the usual 
TV-norm of signal processing which is essentially the $\ell_1$-norm of the
$\ell_2$-norms of the finite differences at any point. In particular, our model
has nothing to do with the Rudin-Osher-Fatemi (ROF) model [ROF92]. 

By analogy with the LASSO [Tib96], Beurling LASSO (BLASSO) is the process of 
reconstructing a discrete measure $\Delta$ from the samples $y$ by finding a
solution to:

\[ \hat\Delta \in \arg\min_{\mu \in \mathcal{M}} \frac{1}{2} \Big\| \int_T \Phi \, d\mu - y \Big\|_2^2 + \lambda \|\mu\|_{TV}, \tag{BLASSO} \]

where $\lambda$ is a tuning parameter. 
Remark. For the case of Fourier coefficients and $\varepsilon = 0$, (BLASSO) is
simply Beurling Minimal Extrapolation [Beu38]. Moreover, in the finite dimension
framework (i.e. $T$ should be viewed as $\mathbb{R}^n$), BLASSO is nothing else
than LASSO. BLASSO is named after this remark. 

Remark. The factor 1/2 in the definition of BLASSO plays no role, except for 
simplifying the proofs. 
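
As noted in the remark on the finite dimension framework, restricting the measure to a fixed grid turns the program into an ordinary LASSO with complex coefficients. The sketch below illustrates this with a plain proximal-gradient (ISTA) loop; it is only an illustration under stated assumptions, not the procedure used in this paper (which relies on a dual program). The function names, the grid size, the step size rule and the toy instance are all hypothetical choices.

```python
import cmath

def soft_threshold(z: complex, tau: float) -> complex:
    """Complex soft-thresholding: shrink the modulus of z by tau, keep its phase."""
    r = abs(z)
    return 0j if r <= tau else (1.0 - tau / r) * z

def objective(A, y, w, lam):
    """Discretized objective: 0.5 * ||A w - y||_2^2 + lam * ||w||_1."""
    resid = [sum(a * wj for a, wj in zip(row, w)) - yi for row, yi in zip(A, y)]
    return 0.5 * sum(abs(r) ** 2 for r in resid) + lam * sum(abs(wj) for wj in w)

def grid_lasso(A, y, lam, iters=300):
    """ISTA on a fixed grid; A[k][j] is phi_k evaluated at the j-th grid point."""
    n_rows, n_cols = len(A), len(A[0])
    # Step 1/L, with L the squared Frobenius norm (an upper bound on ||A||_2^2).
    step = 1.0 / sum(abs(a) ** 2 for row in A for a in row)
    w = [0j] * n_cols
    for _ in range(iters):
        resid = [sum(a * wj for a, wj in zip(row, w)) - yi for row, yi in zip(A, y)]
        grad = [sum(A[i][j].conjugate() * resid[i] for i in range(n_rows))
                for j in range(n_cols)]
        w = [soft_threshold(wj - step * g, step * lam) for wj, g in zip(w, grad)]
    return w

# Hypothetical toy instance: noiseless Fourier samples, f_c = 2, unit spike at t = 0.25.
fc, grid_size = 2, 32
grid = [j / grid_size for j in range(grid_size)]
A = [[cmath.exp(2j * cmath.pi * k * t) for t in grid] for k in range(-fc, fc + 1)]
y = [cmath.exp(2j * cmath.pi * k * 0.25) for k in range(-fc, fc + 1)]
w_hat = grid_lasso(A, y, lam=0.5)
```

The grid version can only place atoms at grid points, which is precisely the limitation that the measure-valued formulation avoids.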

1.4. Detection from noisy Fourier samples. In this subsection, we mention the 
example of Fourier samples to illustrate our results. Recently, much emphasis 
has been put on the recovery of a spike train (discrete measure) from noisy
band-limited data [CFG12a]. In this setting, we observe noisy Fourier samples up to
a frequency cut-off $f_c \in \mathbb{N}^*$. We shall specify notation:

• The number of samples is $2 f_c + 1$, hence $m = 2 f_c$.

• For sake of simplicity, we place ourselves on $T = [0, 1]$.

• For all $k \in \{-f_c, \dots, f_c\}$, we set for all $x \in [0, 1]$,
  $\varphi_k(x) = \exp(i 2\pi k x)$, and
  $\Phi = (\varphi_{-f_c}, \dots, \varphi_{f_c})$.

• Assume that the $\varepsilon_k$ are random complex Gaussian:

  \[ \varepsilon_k = \varepsilon_k^{(1)} + i\, \varepsilon_k^{(2)}, \]

  where $\varepsilon_k^{(1)}, \varepsilon_k^{(2)}$, $k \in \{-f_c, \dots, f_c\}$, are
  i.i.d. centered Gaussian random variables with standard deviation $\sigma$.
  We mention that $\varepsilon = (\varepsilon_{-f_c}, \dots, \varepsilon_{f_c})$.

• Finally, we recall that we observe $y = \int_T \Phi \, d\Delta + \varepsilon$.
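
The observation model listed above can be simulated in a few lines. The following is a minimal sketch, assuming the spike train is given as lists of locations in [0, 1] and complex weights; the function name, its parameters and the fixed seed are hypothetical and serve only the illustration.

```python
import cmath
import random

def noisy_fourier_samples(locations, weights, fc, sigma=0.0, seed=0):
    """Return y_k = c_k(Delta) + eps_k for k = -fc, ..., fc.

    Delta is the discrete measure sum_j weights[j] * delta_{locations[j]} and
    eps_k has i.i.d. centered Gaussian real and imaginary parts with standard
    deviation sigma.
    """
    rng = random.Random(seed)
    samples = []
    for k in range(-fc, fc + 1):
        # Generalized moment: integral of exp(i 2 pi k x) against Delta.
        moment = sum(w * cmath.exp(2j * cmath.pi * k * t)
                     for t, w in zip(locations, weights))
        noise = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        samples.append(moment + noise)
    return samples

# Two spikes observed up to the cut-off frequency f_c = 3 (7 samples).
y_clean = noisy_fourier_samples([0.2, 0.7], [1.0, 0.5j], fc=3)
y_noisy = noisy_fourier_samples([0.2, 0.7], [1.0, 0.5j], fc=3, sigma=0.1)
```

With sigma set to zero, the sample of index k = 0 is simply the total mass of the measure, which gives a quick sanity check.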

Our results show that if the spikes (or atoms) are sufficiently separated, at least 
$2/f_c$ apart, then one can detect some point sources with a known precision by
solving a simple convex optimization program. 

Definition 1.1 (Minimum separation [CFG12b]). For a family of points $S \subset T$, the 
minimum separation is defined as the closest distance between any two elements from $S$: 

\[ \ell(S) = \inf_{(x, x') \in S^2,\ x \neq x'} |x - x'|. \]

We emphasize that the distance is taken around the circle so that, for example,
$|5/6 - 1/6| = 1/3$.
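
The wrap-around distance in Definition 1.1 can be made concrete as follows. This is a small sketch with hypothetical function names; returning infinity for fewer than two distinct points is a convention chosen here (infimum over an empty set), not something stated in the definition.

```python
from itertools import combinations

def circle_distance(x: float, y: float) -> float:
    """Distance between x and y on [0, 1] with the endpoints identified."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def minimum_separation(points) -> float:
    """Smallest wrap-around distance over all pairs of distinct points."""
    distinct = sorted(set(points))
    if len(distinct) < 2:
        return float("inf")  # convention chosen for this sketch
    return min(circle_distance(x, y) for x, y in combinations(distinct, 2))

# The example from the definition: 5/6 and 1/6 are 1/3 apart around the circle.
example = minimum_separation([1 / 6, 5 / 6])
```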

In this framework, we have the following theorem that quantifies the BLASSO 
stability. 

Theorem 1.1. Let $\Delta$ be a discrete measure such that: 

\[ \ell(\mathrm{Supp}(\Delta)) \ge \frac{2}{f_c}, \]

where $\mathrm{Supp}(\Delta)$ denotes the support of $\Delta$. Let $\hat\Delta$ be a
solution to (BLASSO) with tuning parameter $\lambda$ such that: 



A> A F :=0yi2/ c log(/ c 
then, with probability greater than 

l-8exp(-A 2 /A|) 
Figure 1. The problem is the following: we aim at recovering 
some spikes of the original signal (five-pointed stars) from the
observation through a corrupted optical device (blue line), which can 
differ heavily from the true noiseless observation (black dotted 
line). Our procedure (red circles) provides a close estimate of the 
location of some spikes. 



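the following holds. For all $t \in [0, 1]$ such that: 

\[ |\hat\Delta(\{t\})| > \frac{2\lambda}{C_b}, \]

there exists a unique $T \in \mathrm{Supp}(\Delta)$ satisfying: 

\[ |t - T| \le \Big( \frac{2\lambda}{C_a\, |\hat\Delta(\{t\})|} \Big)^{1/2} \frac{1}{f_c} \le \frac{0.16}{f_c}, \]

where $0 < C_b \le (0.16)^2 C_a < 1$ are universal constants. 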

Remark. Furthermore, a solution $\hat\Delta$ to BLASSO can be efficiently computed using 
a companion SDP program. We refer the reader to Section 6.2 for further details. 

This result is new and interesting because it provides a quantitative estimate of 
the location of spikes. To the best of our knowledge, this is the first result of this 
kind in the literature. 

Remark. Observe that our procedure does not assume any knowledge of the total 
number of spikes $s$. This property is of great importance in actual practice. Only 
the minimal distance between any pair of atoms is relevant to BLASSO. 

Remark. Unlike the finite dimension case [BVDG11] (where $T$ should be viewed 
as $\mathbb{R}^n$ and BLASSO is simply LASSO), we have numerically observed the 
following fact: the solutions $\hat\Delta$ to BLASSO generally have fewer atoms than the 
target $\Delta$ (i.e. the size of the estimated support is often less than or equal
to the size of the true support). For instance, Figure 1 shows that the BLASSO
solution has support of size 3 while the target has support of size 5. It seems
that there is no hope of recovering all atoms of the target measure. Hence, we can
only hope to recover the low hanging fruits, namely the large atoms, and our result 
quantifies the localization of these large atoms. 
1.5. Detection from noisy moment samples. In the frame of a polynomial system,
a companion corollary of Theorem 1.1 can be given using the Rice inequality 
given in Proposition 5.1 (that gives a way of tuning $\lambda$) and the Szegő mapping 
(as done in [dCG12]). Notice that, in this case, the minimal spacing between the 
support points is no longer uniform, as the Szegő mapping induces a nonlinear distortion. 

1.6. Comparison with related work. The problem of super-resolution without 
noise has been investigated in numerous articles [Don92, DG96, GG96] (this list 
is not meant to be exhaustive). In particular, we mention [dCG12, CFG12b] 
which exhibit the notion of dual certificates in the measure framework. Hence, in 
[dCG12] the authors investigate TV-norm minimization with different types of 
measurements: trigonometric, polynomial, Laplace transform... Furthermore, the 
article [CFG12b] provides an explicit construction of a tight dual certificate P in 
the Fourier case and gives an upper bound on the magnitude of P at each point. 
We also mention [BP10], which aims at approximating the solution by estimating the 
support of the signal in an iterative fashion. 

In the case of noisy measurements, [CFG12a] derives a stability result for a 
weighted $\ell_1$ distance between measures (the weight function is given by a
high-frequency kernel). In contrast, our result provides a quantitative
localization of the spike train, which is crucial in applications. 

To the best of our knowledge, this paper is the first work on a quantitative 
detection of atoms. 

1.7. Organization of the paper. The next section presents a general definition of 
measures that can be detected using BLASSO. Section 3 gives the main result 
and a key lemma on the localization of the solution to BLASSO. Section 4 pro- 
vides some examples of target measures that can be considered in our frame- 
work. Section 5 presents some results on Gaussian processes that are useful in 
the super-resolution framework. The last part is devoted to the proofs. 

2. Separable measures (SM) 

As a matter of fact, our framework deals with more general samplings (or obser- 
vations). We begin with the definition of a dual certificate (see [dCG12, CFG12b]). 

Definition 2.1 ((Tight) Dual certificate). We say that a linear combination 

\[ P = \sum_{k=0}^{m} a_k \varphi_k \]

is a dual certificate of a discrete measure $\mu$ whose polar decomposition is given by: 

\[ \mu = \sum_{k=1}^{n} \mu_k \exp(i\theta_k)\, \delta_{x_k}, \]

where $\mu_k > 0$, if 

• $\forall k \in \{1, \dots, n\}$, $P(x_k) = \exp(-i\theta_k)$,

• $\forall x \in T$, $|P(x)| \le 1$.

Similarly, we say that $P$ is a "tight dual certificate" if $P$ is a dual certificate and 

• $\forall x \in T \setminus \{x_1, \dots, x_n\}$, $|P(x)| < 1$.

The set of all the dual certificates (resp. tight dual certificates) of a measure $\mu$
is denoted by $\mathcal{P}(\mu)$ (resp. $\mathcal{P}_0(\mu)$).

We use the dual certificate to give the definition of SM measures. 
Definition 2.2 ((Tight) Separable Measure ((T)SM)). We say that a discrete measure 
$\mu$ is a separable measure (SM) (resp. tight separable measure (TSM)) with respect to 
a family $\mathcal{F} = \{\varphi_0, \varphi_1, \dots, \varphi_m\}$ if it has at least
one dual certificate (resp. tight dual certificate). 

Remark. Observe that every solution to a total-variation regularization method is 
SM. Indeed, one can prove that the target $\Delta$ is a solution of the following
total-variation method: 

\[ \Delta_{GME} \in \arg\min_{\mu \in \mathcal{M}} \|\mu\|_{TV} \quad \text{s.t.} \quad \int_T \Phi \, d\mu = \int_T \Phi \, d\Delta, \tag{GME} \]

if and only if $\Delta$ is SM (a proof can be found in [dCG12]). Hence, 
if $\Delta$ is not SM there is no hope of recovering $\Delta$ with a total-variation method. 
Moreover, one can prove [dCG12] that if the target $\Delta$ is TSM then it is the unique 
solution to (GME). 

This remark shows that TSM is a natural assumption in TV-minimization. 
• Assumption: From now on, we assume that $\Delta$ is TSM. 

3. Main result 

3.1. Confinement. The analysis of BLASSO uses the extremal properties of the 
TV-norm. In particular, it is known that the extreme points of the unit ball of the 
TV-norm are the atoms $\delta_x$ for any $x \in T$. Thus the TV-norm minimization forces 
the solutions to be discrete measures. Furthermore, we recall that: 

\[ \forall \mu \in \mathcal{M}, \quad \|\mu\|_{TV} = \sup_{\|f\|_\infty \le 1} \Re\Big( \int_T f \, d\mu \Big), \]

where $\Re(\cdot)$ denotes the real part. It follows that the sub-gradient of the TV-norm 
is given by: 

\[ \partial\|\cdot\|_{TV}(\nu) := \Big\{ f \in L^\infty(T);\ \|\mu\|_{TV} - \|\nu\|_{TV} - \Re\Big( \int_T f \, d(\mu - \nu) \Big) \ge 0,\ \forall \mu \in \mathcal{M} \Big\}. \]

We mention that this sub-differential can be further characterized as: 

\[ \partial\|\cdot\|_{TV}(\nu) = \Big\{ f \in L^\infty(T);\ \Re\Big( \int_T f \, d\nu \Big) = \|\nu\|_{TV} \ \text{and}\ \Re\Big( \int_T f \, d\mu \Big) \le \|\mu\|_{TV},\ \forall \mu \in \mathcal{M} \Big\}. \]

Another tool we shall use in our theorems is the notion of Bregman divergence. 
It is defined as follows. 

Definition 3.1 (Bregman divergence). Let $\mu$ and $\nu$ be two complex measures such that 
$\nu$ is TSM. Define the Bregman divergence as: 

\[ D(\mu, \nu) = \Big\{ \|\mu\|_{TV} - \Re\Big( \int_T P \, d\mu \Big);\ P \in \mathcal{P}_0(\nu) \Big\}. \tag{3.1} \]

We recall that $\mathcal{P}_0(\nu)$ is the set of the tight dual certificates of $\nu$. 

Remark. One can check that the Bregman divergence at point $(\mu, \nu)$ is an interval 
of $[0, +\infty[$. 

Remark. Strictly speaking, the "Bregman divergence" considered in this paper is 
not the usual Bregman divergence (see [BP10] for instance) since we restrict $P$ to 
$\mathcal{P}_0(\nu)$ while the standard Bregman divergence for the total-variation norm
should be: 

\[ \Big\{ \|\mu\|_{TV} - \Re\Big( \int_T P \, d\mu \Big);\ \Re\Big( \int_T P \, d\nu \Big) = \|\nu\|_{TV} \ \text{and}\ \|P\|_\infty = 1 \Big\}. \tag{3.2} \]

As a matter of fact, (3.1) is the intersection of (3.2) with the set of all possible 
linear combinations of the family $\mathcal{F}$. 

Bregman divergence is not a common distance functional since it does not satisfy 
the triangle inequality and it is not symmetric. However, Bregman divergence is 
non-negative and allows us to localize the support of discrete measures. The next 
lemma shows that measures with small divergence are "close". 

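Lemma 3.1 (Confinement). Let $\mu$ and $\nu$ be two complex measures such that $\nu$ is TSM. 
Assume that $\mu$ is discrete with a polar decomposition
$\mu = \sum_{k=1}^{n} \mu_k \exp(i\theta_k)\, \delta_{x_k}$, where 
$\mu_k > 0$. We have the following properties: 

• Let $d$ be in $D(\mu, \nu)$ then 

\[ d = \sum_{k=1}^{n} \mu_k \big[ 1 - |P|(x_k) \cos(\theta_k + \theta_P(x_k)) \big], \]

where 

\[ P(x) = |P|(x) \exp(i\theta_P(x)) \]

is a polar decomposition of a tight dual certificate $P$ (at point $x$) of $\nu$. 

• In particular, for all $k = 1, \dots, n$, 

\[ x_k \in \bigcap_{P \in \mathcal{P}_0(\nu)} |P|^{-1}(I_k) \]

and 

\[ \theta_k \in \arccos(I_k) \boxplus \bigcap_{P \in \mathcal{P}_0(\nu)} \big\{ (-\theta_P)(|P|^{-1}(I_k)) \boxplus \{2p\pi;\ p \in \mathbb{Z}\} \big\}, \]

where $I_k = [1 - (d/\mu_k), 1]$ and $\boxplus$ denotes the Minkowski sum. 

A proof of this lemma can be found in Section A.1. In other words, the support 
$\{x \in T;\ |\mu(\{x\})| \ge \rho\}$ is included in the level set: 

\[ \bigcap_{P \in \mathcal{P}_0(\nu)} |P|^{-1}([1 - (d/\rho), 1]), \]

and the support $\{\theta;\ \exists x \in T \text{ s.t. } \mu(\{x\}) = \mu_x \exp(i\theta) \text{ and } \mu_x \ge \rho\}$
is included in the level set: 

\[ \arccos([1 - (d/\rho), 1]) \boxplus \bigcap_{P \in \mathcal{P}_0(\nu)} \big\{ (-\theta_P)(|P|^{-1}([1 - (d/\rho), 1])) \boxplus \{2p\pi;\ p \in \mathbb{Z}\} \big\}, \]

where $\boxplus$ denotes the Minkowski sum. 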
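
To make the confinement statement concrete, the element of the Bregman divergence attached to one certificate can be evaluated directly for a discrete measure. The sketch below is illustrative only: the function names are hypothetical, and the quadratic P used in the example is an assumed certificate of a single nonnegative atom at 0.5 on [0, 1] (it equals 1 at 0.5 and is strictly smaller elsewhere).

```python
def bregman_element(P, locations, weights):
    """d = ||mu||_TV - Re(int P dmu) for mu = sum_k weights[k] * delta_{locations[k]}."""
    tv_norm = sum(abs(w) for w in weights)
    pairing = sum(P(x) * w for x, w in zip(locations, weights))
    return tv_norm - complex(pairing).real

def confinement_level(d, mass):
    """Lower bound 1 - d/mass on |P| at any atom of mu whose modulus is `mass`."""
    return 1.0 - d / mass

def P(x):
    # Assumed certificate of a nonnegative atom located at 0.5.
    return 1.0 - (x - 0.5) ** 2

d_same = bregman_element(P, [0.5], [2.0])     # atom on the certified location
d_shifted = bregman_element(P, [0.6], [2.0])  # atom moved away by 0.1
level = confinement_level(d_shifted, 2.0)
```

An atom sitting on the certified location gives a zero divergence, while moving it away makes the divergence grow with the mass of the atom, which is why a small divergence confines the large atoms.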
3.2. Main theorem. 

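Theorem 3.2. Let $\Delta$ be a Tight Separable Measure such that: 

\[ P = \sum_{k=0}^{m} a_k \varphi_k \]

is a tight dual certificate. Let $\hat\Delta$ be a solution to (BLASSO) then it holds: 

\[ D(\hat\Delta, \Delta) \le \min\Big\{ \frac{\lambda}{2}\|a\|_2^2 + \frac{1}{\lambda} \Re\Big( \Big\langle \int_T \Phi \, d(\hat\Delta - \Delta), \varepsilon \Big\rangle \Big);\ \frac{\lambda}{2}\Big\|a - \frac{\varepsilon}{\lambda}\Big\|_2^2 \Big\}. \tag{3.3} \]

Moreover, if the tuning parameter $\lambda$ is such that: 

\[ \lambda \ge \lambda_0 := \Big\| \sum_{k=0}^{m} \varepsilon_k \varphi_k \Big\|_\infty, \]

then it holds: 

\[ D(\hat\Delta, \Delta) \le \|a\|_2 \sqrt{4\lambda \|\Delta\|_{TV}} + \frac{1}{2\lambda}\|\varepsilon\|_2^2. \tag{3.4} \]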

A proof of this theorem can be found in Section A.3. 

4. Examples 

4.1. Nonnegative measures and standard moments. The nonnegative measures 
whose support has size $s$ not greater than $m/2$ are tight separable measures with 
respect to the standard moments, i.e. $\varphi_k(x) = x^k$. Indeed, let $\Delta$ be a
nonnegative measure and $S = \{T_1, \dots, T_s\}$ be its support. Following [BE95], set 

\[ P(x) = 1 - c \prod_{i=1}^{s} (x - T_i)^2. \]

Then, for a sufficiently small value of the parameter $c$, the polynomial $P$ has 
supremum norm not greater than 1. The existence of such a polynomial shows 
that the measure $\Delta$ is TSM [dCG12]. 
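
The construction above is easy to check numerically. The sketch below assumes that the support lies in [0, 1], so that every factor is at most 1 and any c in (0, 2) keeps the polynomial within [-1, 1]; the function name, the support points and the value of c are hypothetical choices for the illustration.

```python
def certificate_polynomial(support, c=1.0):
    """Return P(x) = 1 - c * prod_i (x - T_i)^2 for the given support points."""
    def P(x):
        prod = 1.0
        for t in support:
            prod *= (x - t) ** 2
        return 1.0 - c * prod
    return P

support = [0.2, 0.7]
P = certificate_polynomial(support, c=1.0)
grid = [j / 1000 for j in range(1001)]
sup_norm = max(abs(P(x)) for x in grid)  # sup norm estimated on a grid of [0, 1]
```

The polynomial has degree 2s, which is why the support size must not exceed m/2 for P to be a combination of the first m + 1 monomials.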



4.2. Nonnegative measures and the Stieltjes transformation. Any nonnegative 
measure whose support has size $s$ not greater than $m/2$ is SM [dCG12] with 
respect to the family 

\[ \mathcal{F} = \Big\{ 1, \frac{1}{z_1 - x}, \frac{1}{z_2 - x}, \dots \Big\}, \]

where none of the $z_k$'s belongs to $T$. 



4.3. Chebyshev measures and standard moments. Define the Chebyshev
polynomials of the first kind as: 

\[ T_k(x) = \cos(k \arccos(x)), \quad \forall x \in [-1, 1]. \]

It is well known (see [BE95] for instance) that $T_k$ has supremum norm not greater 
than 1, and that 

• $T_k$ is equal to 1 on $\{\cos(2l\pi/k),\ l = 0, \dots, \lfloor k/2 \rfloor\}$,

• $T_k$ is equal to $-1$ on $\{\cos((2l+1)\pi/k),\ l = 0, \dots, \lfloor k/2 \rfloor\}$,

whenever $k > 0$. Then, any real measure $\Delta$ whose Jordan decomposition is given 
by $\Delta = \Delta^+ - \Delta^-$, and such that 

• $\mathrm{Supp}(\Delta^+) \subset \{\cos(2l\pi/k),\ l = 0, \dots, \lfloor k/2 \rfloor\}$,

• $\mathrm{Supp}(\Delta^-) \subset \{\cos((2l+1)\pi/k),\ l = 0, \dots, \lfloor k/2 \rfloor\}$,

for some $0 < k \le n$, is SM. 
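
The two extremal sets of the Chebyshev polynomial can be verified numerically. This is a minimal sketch with hypothetical helper names; it evaluates the trigonometric definition directly and compares with plus or minus one up to floating-point error.

```python
import math

def chebyshev(k: int, x: float) -> float:
    """Chebyshev polynomial of the first kind, T_k(x) = cos(k * arccos(x))."""
    return math.cos(k * math.acos(x))

def plus_points(k: int):
    """Points cos(2 l pi / k), l = 0, ..., floor(k/2), where T_k equals 1."""
    return [math.cos(2 * l * math.pi / k) for l in range(k // 2 + 1)]

def minus_points(k: int):
    """Points cos((2 l + 1) pi / k), l = 0, ..., floor(k/2), where T_k equals -1."""
    return [math.cos((2 * l + 1) * math.pi / k) for l in range(k // 2 + 1)]

k = 5
values_plus = [chebyshev(k, x) for x in plus_points(k)]
values_minus = [chebyshev(k, x) for x in minus_points(k)]
```

With the signs chosen this way, T_k takes the value 1 on the support of the positive part and -1 on the support of the negative part while staying bounded by 1 in modulus, which is what the definition of a dual certificate of a real measure requires.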



4.4. Minimum separation measures. Last but not least, E.J. Candès and C. 
Fernandez-Granda [CFG12b, CFG12a] have shown that tight dual certificates 
(with respect to the Fourier basis) exist for discrete complex measures satisfying 
a "minimum separation condition". Their Proposition 2.1 [CFG12b] and Lemma 
2.4 [CFG12a] give an explicit construction using the Fejér kernel. 
5. Rice method 

5.1. Polynomial case. We consider the Gaussian process $X_m(t)$, $t \in [0, 1]$, defined 
by: 

\[ X_m(t) = \xi_0 + \xi_1 t + \xi_2 t^2 + \dots + \xi_m t^m, \]

where $\xi_0, \dots, \xi_m$ are i.i.d. standard normal. Its covariance function is: 

\[ r(s, t) = 1 + st + s^2 t^2 + \dots + s^m t^m, \]

where the dependence in $m$ has been omitted. Its variance function is: 

\[ \sigma_m^2(t) = 1 + t^2 + t^4 + \dots + t^{2m}. \]

Its maximal variance is: 

\[ \sigma^2 = m + 1. \]



Proposition 5.1. Let $M = \max_{t \in [0, 1]} |X_m(t)|$. Then, for $u > 2\sqrt{m + 1}$, 
(m + l)y/riu + m\p2 



P{M > u} < 2 



xp(u /Vrn + 1)+ 2(1 -Y(u)) ; 



2-^/Trm 

where $\psi$ and $\Psi$ are respectively the standard normal density and distribution function. 
For sake of completeness, a proof can be found in Section A.2. 
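5.2. Fourier case. We consider the trigonometric functions: 

\[ \varphi_k(t) = \exp(i 2\pi k t), \quad t \in [0, 1] \ \text{and}\ k \in \mathcal{K} := \{-f_c, \dots, f_c\}, \]

and random complex Gaussian errors: 

\[ \varepsilon_k = \varepsilon_k^{(1)} + i\, \varepsilon_k^{(2)}, \]

where the variables $\varepsilon_k^{(1)}, \varepsilon_k^{(2)}$, $k \in \mathcal{K}$, are
independent with standard normal distribution. 

Proposition 5.2. Let $Z(t) = \sum_{k \in \mathcal{K}} \varepsilon_k \varphi_k(t)$. Then, for $u > \sqrt{2}$, 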



F { -p ||Z«)|| > »} < 4(exp(-^) + /^exp ( - ^)) . 

For sake of completeness, a proof can be found in Section A.2. 

6. Numerical experiments 

6.1. Fenchel dual program. The usual convex analysis shows that (BLASSO) can 
be viewed as a Fenchel dual problem (see [Zal02, BP10] for a definition). As 
a matter of fact, any solution to (BLASSO) can be faithfully computed from a 
companion program that builds a dual certificate of that solution. 

Proposition 6.1 ([Zal02, BP10]). The problem 

I l|2 

(61) +I {« e C« +1 ; IKLo«mll»<A}( fl ) 

has its Fenchel dual with the same minimizers as (BLASSO). Here, the indicator $I_E(v)$ 
of a set $E \subset \mathbb{C}^{m+1}$ is defined by $I_E(v) = 0$ if $v \in E$ and
$I_E(v) = +\infty$ otherwise. 

Using the predual problem (6.1), it is possible to derive optimality conditions for 
(BLASSO). Hence, we mention that every solution to (BLASSO) is SM as shown by 
Proposition 3 in [BP10] (their analysis extends naturally to the complex field). 
Proposition 6.2 ([BP10]). The optimization problem (BLASSO) admits at least one
solution. Moreover, every solution $\hat\Delta$ is SM and it has a dual certificate
$P = \sum_{k=0}^{m} a_k \varphi_k$ 
where 



(6.2) Vfce{0 m}, h = —, Vk ■ 

A 

Remark. We have an explicit formulation of a dual certificate $P$ of $\hat\Delta$ using (6.2). 
Moreover, every solution to (BLASSO) is discrete, SM and satisfies: 

\[ \{x \in T;\ |\hat\Delta(\{x\})| > 0\} \subset \{x \in T;\ |P|(x) = 1\}. \tag{6.3} \]

In other words, the support of $\hat\Delta$ is included in the set of the points at which $|P|$ 
is maximal. 

On the algorithmic side, the program (6.1) allows us to compute a dual certificate 
of a solution $\hat\Delta$ to (BLASSO). As a matter of fact, it takes the form: 

\[ \hat a \in \arg\min_{a \in \mathbb{C}^{m+1}} \Big\| a - \frac{y}{\lambda} \Big\|_2^2 \quad \text{subject to} \quad \Big\| \sum_{k=0}^{m} a_k \varphi_k \Big\|_\infty \le 1. \tag{Dual BLASSO} \]

By definition of a dual certificate, the support of $\hat\Delta$ is located at the points where 
the dual certificate $P = \sum_{k=0}^{m} \hat a_k \varphi_k$ has modulus equal to 1, its maximal value. 
Once the support is estimated accurately, a solution to (BLASSO) can be found 
by solving a well-posed linear problem. 

6.2. Fourier Case. In this subsection, we follow notation of the Fourier case de- 
scribed in Section 1.4. At first glance, the program (Dual BLASSO) seems difficult 
to solve due to the norm in the hard constraint. As pointed out by [CFG12b], 
this difficulty can be circumvented using the following lemma. 

Lemma 6.3 (Corollary to Theorem 4.24 in [Dum07]). A trigonometric polynomial: 

\[ P = \sum_{k=-f_c}^{f_c} a_k \exp(i 2\pi k t) \]

is bounded by one in magnitude if and only if there exists a Hermitian matrix
$Q \in \mathbb{C}^{(m+1) \times (m+1)}$ satisfying: 

(6.4) (Q f) h0 and 

We deduce that (Dual BLASSO) is equivalent to the following Semi-Definite
Program: 

\[ \hat a \in \arg\min_{a \in \mathbb{C}^{m+1}} \Big\| a - \frac{y}{\lambda} \Big\|_2^2 \quad \text{subject to (6.4)}. \tag{6.5} \]

This program gives the coefficients of a dual certificate $P$ of $\hat\Delta$. The level set of 
the maximum value of $|P|$ contains the support of $\hat\Delta$, see (6.3). It suffices to solve 
a regular LASSO on these points to deduce the values of the weights of $\hat\Delta$. 

Remark. As pointed out by [CFG12b], it is not clear that $P$ is not constant and 
hence that the level set of the maximum value of $|P|$ is discrete. However, we have 
run several numerical experiments, and they all show that $P$ is not constant. We 
defer the theoretical analysis of this fact to future work. 
Appendix A. Proofs 

A.1. Lemma 3.1. The lemma follows from the identity: 

\[ d = \|\mu\|_{TV} - \Re\Big( \int_T P \, d\mu \Big) = \sum_{k=1}^{n} \mu_k \big[ 1 - |P|(x_k) \cos(\theta_k + \theta_P(x_k)) \big], \]

hence 

\[ |P|(x_k) \cos(\theta_k + \theta_P(x_k)) \ge 1 - \frac{d}{\mu_k}. \]

We conclude considering the level sets associated with the value $1 - (d/\mu_k)$. 
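A.2. Rice formula. 

Polynomial case. By the Rice method [AW08], for $u > 0$, 

\[ \begin{aligned}
P\{M > u\} &\le 2 P\Big\{ \max_{t \in [0,1]} X_m(t) > u \Big\} \\
&\le 2 P\{X_m(0) > u\} + 2 E(U_u[0, 1]) \\
&= 2(1 - \Psi(u)) + 2 \int_0^1 E\big( (X_m'(t))^+ \mid X_m(t) = u \big)\, \psi_{\sigma_m(t)}(u) \, dt,
\end{aligned} \]

where $U_u$ is the number of up-crossings of the level $u$ and $\psi_\sigma$ is the density of the 
centered normal distribution with standard deviation $\sigma$. Regression formulas imply 
that: 

\[ E(X_m'(t) \mid X_m(t) = u) = \frac{r_{0,1}(t, t)}{r(t, t)}\, u, \]
\[ \mathrm{Var}(X_m'(t) \mid X_m(t) = u) \le \mathrm{Var}(X_m'(t)) = r_{1,1}(t, t), \]

where, for instance, $r_{1,1}(s, t) = \frac{\partial^2 r(s, t)}{\partial s \, \partial t}$. We have: 

\[ r_{0,1}(t, t) = t + 2t^3 + \dots + m t^{2m-1}, \]
\[ r_{1,1}(t, t) = 1 + 4t^2 + \dots + m^2 t^{2m-2}. \]

On the other hand, if $Z \sim N(\mu, \sigma^2)$ then 

\[ E(Z^+) = \mu \Psi(\mu/\sigma) + \sigma \psi(\mu/\sigma) \le \mu^+ + \frac{\sigma}{\sqrt{2\pi}}. \]

We get that: 

\[ \begin{aligned}
P\{M > u\} \le{} & 2(1 - \Psi(u)) + 2 \Big( \int_0^1 \frac{t + 2t^3 + \dots + m t^{2m-1}}{1 + t^2 + \dots + t^{2m}}\, u\, \psi_{\sigma_m(t)}(u) \, dt \\
& + \frac{1}{\sqrt{2\pi}} \int_0^1 \sqrt{1 + 4t^2 + \dots + m^2 t^{2m-2}}\; \psi_{\sigma_m(t)}(u) \, dt \Big) \\
:={} & 2(1 - \Psi(u)) + 2(A + B).
\end{aligned} \]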
We use the following straightforward relations: 

• for <T\ <Ui< U, l/V-^w) < lpcr 2 {u), 

• for u > 2 and c < 1, a~ 2 ip a (u) < ip(u), 

• and 



'In 



sJA-y + ■■■ Am < y/Ai + V^JA 
Eventually, we get, for $u > 2\sqrt{m + 1}$: 



m + 1 . . 
A < —j— UfamW, 



and we are done. □ 
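Fourier case. We have: 

\[ \begin{aligned}
Z(t) = {} & \varepsilon_0^{(1)} + \sum_{k=1}^{f_c} \Big[ (\varepsilon_k^{(1)} + \varepsilon_{-k}^{(1)}) \cos(2\pi k t) + (\varepsilon_{-k}^{(2)} - \varepsilon_k^{(2)}) \sin(2\pi k t) \Big] \\
& + i \Big[ \varepsilon_0^{(2)} + \sum_{k=1}^{f_c} \Big( (\varepsilon_k^{(2)} + \varepsilon_{-k}^{(2)}) \cos(2\pi k t) + (\varepsilon_k^{(1)} - \varepsilon_{-k}^{(1)}) \sin(2\pi k t) \Big) \Big].
\end{aligned} \]

One can see that $Z(t) = X(t) + i Y(t)$ where $X(t)$ and $Y(t)$ are two independent 
Gaussian stationary processes with the same auto-covariance function: 

\[ \Gamma(t) = 1 + 2 \sum_{k=1}^{f_c} \cos(2\pi k t) = D_{f_c}(t), \]

where $D_{f_c}(t)$ denotes the Dirichlet kernel. Set: 

\[ \sigma_m^2 = \mathrm{Var}(X(t)) = D_{f_c}(0) = 2 f_c + 1. \]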
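We use the following inequalities: 

\[ P\{\|Z\|_\infty > u\} \le P\{\|X\|_\infty > u/\sqrt{2}\} + P\{\|Y\|_\infty > u/\sqrt{2}\} = 2 P\{\|X\|_\infty > u/\sqrt{2}\}, \]

and 

\[ P\{\|X\|_\infty > u/\sqrt{2}\} \le 2 P\Big\{ \sup_{t \in [0,1]} X(t) > u/\sqrt{2} \Big\}. \tag{A.1} \]

To give bounds to the right hand side of (A.1), we use the Rice method [AW08] 
using the fact that the process $X(t)$ (for example) is periodic with
$\Gamma(1/(2 f_c + 1)) = 0$: 

\[ \begin{aligned}
P\Big\{ \sup_{t \in [0,1]} X(t) > u/\sqrt{2} \Big\} &= P\{\forall t \in [0, 1];\ X(t) > u/\sqrt{2}\} + P\{U_{u/\sqrt{2}} > 0\} \\
&\le \big( \bar\Psi(u/(\sqrt{2}\sigma_m)) \big)^2 + E(U_{u/\sqrt{2}}),
\end{aligned} \]

where $U_v$ is the number of up-crossings of the level $v$ by the process $X(t)$ on the 
interval $[0, 1]$ and $\bar\Psi$ is the tail of the standard normal distribution. By the Rice 
formula: 

\[ E(U_{u/\sqrt{2}}) = \frac{1}{2\pi} \sqrt{\frac{\mathrm{Var}(X'(t))}{\sigma_m^2}}\, \exp\Big( -\frac{u^2}{4\sigma_m^2} \Big), \]

where: 

\[ \mathrm{Var}(X'(t)) = -\Gamma''(0) = 2(2\pi)^2 \sum_{k=1}^{f_c} k^2 = \frac{4\pi^2}{3} f_c (f_c + 1)(2 f_c + 1). \]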
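
As a worked step, the two variances can be substituted into the classical Rice formula for the expected number of up-crossings of a stationary Gaussian process. This is a sketch of the arithmetic only; it assumes the standard form of that formula and uses $\Gamma(0) = 2 f_c + 1$ together with the value of $-\Gamma''(0)$ computed from the sum of squares $\sum_{k=1}^{f_c} k^2 = f_c(f_c+1)(2f_c+1)/6$.

```latex
\begin{aligned}
E(U_v) &= \frac{1}{2\pi} \sqrt{\frac{-\Gamma''(0)}{\Gamma(0)}}\,
          \exp\Big( -\frac{v^2}{2\,\Gamma(0)} \Big) \\
       &= \frac{1}{2\pi} \sqrt{\frac{\tfrac{4\pi^2}{3} f_c (f_c+1)(2f_c+1)}{2f_c+1}}\,
          \exp\Big( -\frac{v^2}{2(2f_c+1)} \Big) \\
       &= \sqrt{\frac{f_c (f_c+1)}{3}}\,
          \exp\Big( -\frac{v^2}{2(2f_c+1)} \Big),
\qquad\text{so that, with } v = u/\sqrt{2},\quad
E(U_{u/\sqrt{2}}) = \sqrt{\frac{f_c (f_c+1)}{3}}\,
          \exp\Big( -\frac{u^2}{4(2f_c+1)} \Big).
\end{aligned}
```

The factor $2 f_c + 1$ cancels inside the square root, so the prefactor grows only linearly in $f_c$.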

The following inequality is well known: for $v > 0$, $\bar\Psi(v) \le \exp(-v^2/2)$. It yields: 



" 1 / / " 2 x , //c(/c + l) / «* 



The result follows. □ 
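A.3. Theorem 3.2. Let $a = (a_k)_{k=0}^{m}$ be the coefficients of a dual certificate $P$ of $\Delta$: 

\[ P = \sum_{k=0}^{m} a_k \varphi_k := \langle \Phi, a \rangle. \]

The corresponding element of $D(\hat\Delta, \Delta)$ is given by the expression: 

\[ d = \|\hat\Delta\|_{TV} - \|\Delta\|_{TV} - \Re\Big( \int_T P \, d(\hat\Delta - \Delta) \Big). \]

From the definition of BLASSO, we know that 

\[ \frac{1}{2} \Big\| \int_T \Phi \, d\hat\Delta - y \Big\|_2^2 + \lambda \|\hat\Delta\|_{TV} \le \frac{1}{2}\|\varepsilon\|_2^2 + \lambda \|\Delta\|_{TV}. \]

It holds: 

\[ \frac{1}{2} \Big\| \int_T \Phi \, d\hat\Delta - y \Big\|_2^2 + \lambda d + \lambda \Re\Big( \int_T P \, d(\hat\Delta - \Delta) \Big) \le \frac{1}{2}\|\varepsilon\|_2^2. \tag{A.2} \]

• From (A.2), we deduce the following inequalities: 

First inequality: Since $y = \int_T \Phi \, d\Delta + \varepsilon$, it follows that: 

\[ \frac{1}{2} \Big\| \int_T \Phi \, d(\hat\Delta - \Delta) \Big\|_2^2 - \Re\Big( \Big\langle \int_T \Phi \, d(\hat\Delta - \Delta), \varepsilon \Big\rangle \Big) + \lambda d + \lambda \Re\Big( \Big\langle \int_T \Phi \, d(\hat\Delta - \Delta), a \Big\rangle \Big) \le 0. \]

And so: 

\[ \frac{1}{2} \Big\| \int_T \Phi \, d(\hat\Delta - \Delta) + \lambda a \Big\|_2^2 - \Re\Big( \Big\langle \int_T \Phi \, d(\hat\Delta - \Delta), \varepsilon \Big\rangle \Big) + \lambda d \le \frac{1}{2}\|\lambda a\|_2^2. \]

A simple calculation gives that: 

\[ \frac{1}{2} \Big\| \int_T \Phi \, d(\hat\Delta - \Delta) + \lambda a \Big\|_2^2 + \lambda d \le \frac{1}{2}\|\lambda a\|_2^2 + \Re\Big( \Big\langle \int_T \Phi \, d(\hat\Delta - \Delta), \varepsilon \Big\rangle \Big). \]

Eventually, we get: 

\[ d \le \frac{\lambda}{2}\|a\|_2^2 + \frac{1}{\lambda} \Re\Big( \Big\langle \int_T \Phi \, d(\hat\Delta - \Delta), \varepsilon \Big\rangle \Big). \]

Second inequality: We have: 

\[ \frac{1}{2} \Big\| \int_T \Phi \, d\hat\Delta - y + \lambda a \Big\|_2^2 + \lambda d \le \frac{1}{2}\|\lambda a\|_2^2 + \frac{1}{2}\|\varepsilon\|_2^2 - \lambda \Re(\langle \varepsilon, a \rangle). \]

Eventually, we get: 

\[ d \le \frac{\lambda}{2} \Big\| a - \frac{\varepsilon}{\lambda} \Big\|_2^2. \]
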
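• If $\lambda \ge \lambda_0 := \| \sum_{k=0}^{m} \varepsilon_k \varphi_k \|_\infty$ then we have the following result. 

Lemma A.1. Under the same hypothesis as Theorem 3.2, it holds: 

\[ \frac{1}{2} \Big\| \int_T \Phi \, d(\hat\Delta - \Delta) \Big\|_2^2 \le 2\lambda \|\Delta\|_{TV}. \]

Proof. From the definition of BLASSO, we know that 

\[ \frac{1}{2} \Big\| \int_T \Phi \, d\hat\Delta - y \Big\|_2^2 + \lambda \|\hat\Delta\|_{TV} \le \frac{1}{2}\|\varepsilon\|_2^2 + \lambda \|\Delta\|_{TV}. \]

Since $y = \int_T \Phi \, d\Delta + \varepsilon$, it follows that: 

\[ \frac{1}{2} \Big\| \int_T \Phi \, d(\hat\Delta - \Delta) \Big\|_2^2 - \Re\Big( \Big\langle \int_T \Phi \, d(\hat\Delta - \Delta), \varepsilon \Big\rangle \Big) + \lambda \|\hat\Delta\|_{TV} \le \lambda \|\Delta\|_{TV}. \]

By linearity, we have: 

\[ \frac{1}{2} \Big\| \int_T \Phi \, d(\hat\Delta - \Delta) \Big\|_2^2 \le \Re\Big( \int_T \langle \Phi, \varepsilon \rangle \, d(\hat\Delta - \Delta) \Big) + \lambda \big( \|\Delta\|_{TV} - \|\hat\Delta\|_{TV} \big), \]

where 

\[ \langle \Phi, \varepsilon \rangle = \sum_{k=0}^{m} \varepsilon_k \varphi_k. \]

Set $\lambda_0 = \| \langle \Phi, \varepsilon \rangle \|_\infty$ then 

\[ \frac{1}{2} \Big\| \int_T \Phi \, d(\hat\Delta - \Delta) \Big\|_2^2 \le \lambda_0 \big( \|\hat\Delta\|_{TV} + \|\Delta\|_{TV} \big) + \lambda \big( \|\Delta\|_{TV} - \|\hat\Delta\|_{TV} \big). \]

Since $\lambda \ge \lambda_0$, it holds: 

\[ \frac{1}{2} \Big\| \int_T \Phi \, d(\hat\Delta - \Delta) \Big\|_2^2 \le 2\lambda \|\Delta\|_{TV}. \]

□ 
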

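From (A.2), it holds: 

\[ \lambda d \le \frac{1}{2}\|\varepsilon\|_2^2 - \lambda \Re\Big( \Big\langle \int_T \Phi \, d(\hat\Delta - \Delta), a \Big\rangle \Big). \]

Using the Cauchy-Schwarz inequality and the previous lemma: 

\[ d \le \|a\|_2 \sqrt{4\lambda \|\Delta\|_{TV}} + \frac{1}{2\lambda}\|\varepsilon\|_2^2. \]
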
A.4. Theorem 1.1. We begin with a key lemma [CFG12b]. 

Lemma A.2 (Tight dual certificate [CFG12b]). Let $T = \{T_1, \dots, T_s\} \subset [0, 1]$ be the 
support of the target measure $\Delta$. We recall that: 

\[ \Delta = \sum_{k=1}^{s} \Delta_k \exp(i\theta_k)\, \delta_{T_k}. \]

If $\ell(T) \ge 2/f_c$ then there exists a tight dual certificate $P$ such that: 

\[ \forall x \in [0, 1], \quad P(x) = \sum_{k=-f_c}^{f_c} a_k \exp(i 2\pi k x), \]

satisfying the following properties for all $k = 1, \dots, s$: 

• $P(T_k) = \exp(-i\theta_k)$,

• Bound on the Taylor expansion at point $T_k$: 

\[ \forall x \in \Big[ T_k - \frac{0.16}{f_c},\ T_k + \frac{0.16}{f_c} \Big], \quad |P(x)| \le 1 - C_a f_c^2 (x - T_k)^2, \]

• Bound on the complement: 

\[ \forall x \in [0, 1] \setminus \bigcup_{k=1}^{s} \Big[ T_k - \frac{0.16}{f_c},\ T_k + \frac{0.16}{f_c} \Big], \quad |P(x)| \le 1 - C_b, \]

with $0 < C_b \le (0.16)^2 C_a < 1$ universal constants. 

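Proof of Theorem 1.1. The Rice method ensures that
$\lambda \ge \lambda_0 := \| \sum_{k=-f_c}^{f_c} \varepsilon_k \varphi_k \|_\infty$ 
with a probability described in the statement of Theorem 1.1. The previous 
lemma shows that $\Delta$ is TSM and gives an explicit upper bound on the dual certificate $P$. 
The hypotheses of Theorem 3.2 are matched and we shall invoke (3.3) to get: 

\[ D(\hat\Delta, \Delta) \le \frac{\lambda}{2} \Big\| a - \frac{\varepsilon}{\lambda} \Big\|_2^2. \]

Using the triangle inequality and Parseval's identity, it yields: 

\[ \Big\| a - \frac{\varepsilon}{\lambda} \Big\|_2 \le \|a\|_2 + \Big\| \frac{\varepsilon}{\lambda} \Big\|_2 = \Big( \int_{[0,1]} |P|^2(x) \, dx \Big)^{1/2} + \frac{1}{\lambda} \Big( \int_{[0,1]} \Big| \sum_{k=-f_c}^{f_c} \varepsilon_k \varphi_k \Big|^2(x) \, dx \Big)^{1/2}. \]

Since $\| \sum_{k=-f_c}^{f_c} \varepsilon_k \varphi_k \|_\infty \le \lambda$ and $\|P\|_\infty \le 1$, we have: 

\[ \Big\| a - \frac{\varepsilon}{\lambda} \Big\|_2 \le 2. \tag{A.3} \]

Finally, it holds: 

\[ D(\hat\Delta, \Delta) \le 2\lambda. \]

Now, let $t \in [0, 1]$ such that: 

\[ |\hat\Delta(\{t\})| > \frac{2\lambda}{C_b}. \]

Let $d \in D(\hat\Delta, \Delta)$ then (A.3) gives: 

\[ \frac{d}{|\hat\Delta(\{t\})|} \le \frac{2\lambda}{|\hat\Delta(\{t\})|} < C_b \le (0.16)^2 C_a, \]

where $C_a$ is the same constant as in Lemma A.2. The result follows invoking 
Lemma 3.1 and Lemma A.2. □ 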